Abstract. Morphisms to finite semigroups can be used for recognizing
omega-regular languages. The so-called strongly recognizing morphisms
can be seen as a deterministic computation model which provides minimal
objects (known as the syntactic morphism) and a trivial complementation
procedure. We give a quadratic-time algorithm for computing
the syntactic morphism from any given strongly recognizing morphism,
thereby showing that minimization is easy as well. In addition, we give
algorithms for efficiently solving various decision problems for weakly
recognizing morphisms. Weakly recognizing morphisms are often smaller
than their strongly recognizing counterparts. Finally, we describe the
language operations needed for converting formulas in monadic second-order
logic (MSO) into strongly recognizing morphisms, and we give
some experimental results.


1 Introduction 

Automata over finite words have a huge number of effective closure properties. Moreover,
many problems such as minimization or equivalence of deterministic automata
admit very efficient algorithms. The situation over infinite words is quite similar,
but with the major difference that many operations are less efficient. There
are many different automaton models for accepting languages of infinite words, the
so-called $\omega$-regular languages. Each of these models has its advantages and
disadvantages. For instance, deterministic Büchi automata are less powerful than
nondeterministic Büchi automata. Moreover, only very few automaton models admit
efficient minimization algorithms; for example, minimization of deterministic finite
automata can be applied to the lasso automata in [2].


*This work was supported by the DFG grants DI 435/5-2 and KU 2716/1-1. 
The theories of finite semigroups and of automata are tightly connected. Since
the semigroup for a language can be exponentially bigger than its automaton,
semigroups have very rarely been considered in the context of efficient algorithms. There
is also an algebraic approach to $\omega$-regular languages by using morphisms to finite
semigroups. Among the many nice properties of this approach are
minimal morphisms (the so-called syntactic morphisms) and easy complementation.
As for finite words, the semigroup for an $\omega$-regular language can be
exponentially bigger than its Büchi automaton. However, since many operations for
$\omega$-regular languages are less efficient than for regular languages over finite words,
the drawback of this exponential blow-up in size is less serious. This is even more
so when minimizing all intermediate objects.

A typical algorithm for computing the syntactic morphism of a regular language 
over finite words is to minimize the (deterministic) automaton defined by the Cayley 
graph of a morphism, and then the syntactic morphism is given by the transition 
semigroup of the minimal automaton. This approach does not work for infinite words 
and we therefore give a direct algorithm for computing the syntactic morphism. Our 
algorithm is an adaptation of Hopcroft's minimization algorithm [5] and its running
time is quadratic in the size of the semigroup. We show that this is close to optimal.

There are two different modes for recognizing omega-regular languages by a
morphism to a finite semigroup: weak and strong recognition. Strong recognition is a
special case of weak recognition. Easy complementation and the computation of the
syntactic morphism only work for strong recognition. We show how to test whether
a given weak recognition is actually strong. Another useful tool for morphisms is
the computation of the so-called conjugacy classes.

As an application, we consider the translation of MSO formulas into strongly
recognizing morphisms. To this end, we show that a powerset construction preserves
strong recognition, and that this construction can be used for computing the
image under a length-preserving morphism. Finally, we give the test results of some
translations from MSO to strong recognition. Deciding the satisfiability of an MSO
formula is non-elementary and therefore, minimization of intermediate objects
is usually very helpful for solving some special cases. This is confirmed by our test
results.

2 Preliminaries 

Words. Let $A$ be a finite alphabet. The elements of $A$ are called letters. A finite
word is a sequence $a_1 a_2 \cdots a_n$ of letters of $A$ and an infinite word is an infinite
sequence $a_1 a_2 \cdots$. The empty word is denoted by $\varepsilon$. The set of non-empty finite words
over $A$ is $A^+$. Let $K$ be a set of finite words and let $L$ be a set of infinite words. We set
$KL = \{u\alpha \mid u \in K, \alpha \in L\}$, $K^+ = \{u_1 u_2 \cdots u_n \mid n \geq 1, u_i \in K\}$ and $K^* = K^+ \cup \{\varepsilon\}$.
Moreover, if $\varepsilon \notin K$ we define the infinite iteration $K^\omega = \{u_1 u_2 \cdots \mid u_i \in K\}$. A
natural extension to $K \subseteq A^*$ is $K^\omega = (K \setminus \{\varepsilon\})^\omega$.
Finite semigroups. Let $S$ be a finite semigroup. An element $e$ of $S$ is idempotent if
$e^2 = e$. The set of idempotent elements of $S$ is denoted by $E(S) = \{e \in S \mid e^2 = e\}$.
For each $s \in S$ the set $\{s^k \mid k \geq 1\}$ of all powers of $s$ is finite and it contains exactly
one idempotent element.
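
The unique idempotent power can be found by simply walking through the powers of $s$. The following is a minimal sketch, assuming that the multiplication is available as a Python function `mul`; the name `idempotent_power` and the example semigroup (integers modulo 6 under multiplication) are illustrative choices and not taken from the paper.

```python
def idempotent_power(s, mul):
    """Return the unique idempotent among the powers s, s^2, s^3, ...

    Assumes that `mul` is the associative multiplication of a finite semigroup.
    """
    power = s
    while mul(power, power) != power:
        power = mul(power, s)  # move on to the next power of s
    return power


# Hypothetical example: integers modulo 6 under multiplication.
def mul_mod6(x, y):
    return (x * y) % 6


# The powers of 2 are 2, 4, 2, 4, ... and 4 * 4 = 16 = 4 (mod 6).
print(idempotent_power(2, mul_mod6))  # 4
```

The loop terminates because the set of powers is finite and contains an idempotent, so at most $|S|$ multiplications by $s$ are needed.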

A semigroup $S$ is called $X$-generated if $X$ is a subset of $S$ and every element of
$S$ can be written as a product of elements of $X$. The right Cayley graph of an
$X$-generated semigroup $S$ has $S$ as vertices and its labeled edges are the triples of the
form $(s, a, sa)$ for $s \in S$ and $a \in X$. The left Cayley graph of $S$ is defined analogously
with edges of the form $(s, a, as)$. The definitions of Cayley graphs depend on the
choice of the set $X$. In the following, when a surjective morphism $h\colon A^+ \to S$ is
given, we choose $X = h(A)$ as the set of generators.

Green's relations are an important tool in the study of finite semigroups. We
denote by $S^1$ the monoid that is obtained by adding a new neutral element $1$ to $S$.
For $s, t \in S$ let

    $s \mathrel{\mathcal{R}} t$ if there exist $q, q' \in S^1$ such that $sq = t$ and $tq' = s$,

    $s \mathrel{\mathcal{L}} t$ if there exist $p, p' \in S^1$ such that $ps = t$ and $p't = s$.

These relations are equivalence relations and the equivalence classes of $\mathcal{R}$ (resp. $\mathcal{L}$)
are called $\mathcal{R}$-classes (resp. $\mathcal{L}$-classes). The $\mathcal{R}$-classes (resp. $\mathcal{L}$-classes) of a semigroup
$S$ can be computed in time linear in $|S|$ by applying Tarjan's algorithm to the right
(resp. left) Cayley graph of $S$, see [1].

An element $(s, e) \in S \times E(S)$ is a linked pair if $se = s$. Two linked pairs $(s, e)$
and $(t, f)$ are conjugate, written as $(s, e) \sim (t, f)$, if there exist $x, y \in S$ such that
$sx = t$, $xy = e$ and $yx = f$. The conjugacy relation $\sim$ on the set of linked pairs is an
equivalence relation, see e.g. [5]. The equivalence classes of $\sim$ are called conjugacy
classes. A set $P$ of linked pairs is closed under conjugation if it is a union of conjugacy
classes.

Recognition by morphisms. A language $L \subseteq A^\omega$ is regular (or $\omega$-regular) if it is
recognized by some finite Büchi automaton, see e.g. [3]. The family of regular languages
is closed under Boolean operations, i.e., set union, set intersection and
complementation. We now describe algebraic recognition modes for regular languages. Let
$h\colon A^+ \to S$ be a morphism onto a finite semigroup $S$. For $s \in S$, we set $[s] = h^{-1}(s)$
and for $P \subseteq S \times S$, we set

    $[P] = \bigcup_{(s,t) \in P} [s][t]^\omega$

if $h$ is understood from the context. A language $L \subseteq A^\omega$ is weakly recognized by
a morphism $h\colon A^+ \to S$ if there exists a set of linked pairs $P \subseteq S \times E(S)$ with
$L = [P]$. If in addition $P$ is closed under conjugation, then $h$ strongly recognizes $L$.
Another well-known characterisation of strong recognition is the following.




Proposition 1 Let $h\colon A^+ \to S$ be a morphism onto a finite semigroup. Then $h$
strongly recognizes $L$ if and only if $[s][t]^\omega \cap L \neq \emptyset$ implies $[s][t]^\omega \subseteq L$ for all $s, t \in S$.


Proof. For the direction from left to right, we have $L = [P]$ for some set $P$ that is
closed under conjugation. Let $\alpha, \beta \in [s][t]^\omega$ for some $s, t \in S$ and let $n \geq 1$ such that
$t^n \in E(S)$. Note that $(st^n, t^n)$ is a linked pair and we also have $\alpha, \beta \in [st^n][t^n]^\omega$.
It suffices to show that $\alpha \in L$ implies $\beta \in L$. If $\alpha \in L$, there exist a linked
pair $(r, e) \in P$ and a factorization $\alpha = u v_1 v_1' v_2 v_2' \cdots$ with $h(u) = st^n$, $h(uv_1) = r$,
$h(v_i v_i') = t^n$ and $h(v_i' v_{i+1}) = e$ for all $i \geq 1$. Additionally, since $S$ is finite, there exist
indices $i, j$ with $1 \leq i < j$ such that $h(v_i) = h(v_j)$. We set $x = h(v_i) = h(v_j)$ and
$y = h(v_i' v_{i+1} \cdots v_{j-1} v_{j-1}')$. Now, $st^n x = st^{ni} x = h(u v_1 v_1' \cdots v_{i-1} v_{i-1}' v_i) = r e^{i-1} = r$.
By a similar argument, we get $xy = t^n$ and $yx = e$. Hence, $(st^n, t^n)$ and $(r, e)$ are
conjugate. Thus, $(st^n, t^n)$ is contained in $P$ and we have $\beta \in L$.

For the converse implication, we define $P$ as the set of all linked pairs $(s, e)$ with
$[s][e]^\omega \subseteq L$. Let $(s, e) \in P$ and let $(t, f)$ be a linked pair such that $(s, e)$ and $(t, f)$
are conjugate, i.e., $sx = t$, $xy = e$ and $yx = f$ for some $x, y \in S$. Since $h$ is onto,
there exist words $u, v, w \in A^+$ such that $h(u) = s$, $h(v) = x$ and $h(w) = y$. Now,
the infinite word $u(vw)^\omega = uv(wv)^\omega$ is contained in the intersection $[s][e]^\omega \cap [t][f]^\omega$
and by assumption we have $[t][f]^\omega \subseteq L$. This shows that $(t, f)$ is in $P$. □

The syntactic congruence $\equiv_L$ of a language $L \subseteq A^\omega$ is defined over $A^+$ as $u \equiv_L v$
if the equivalences

    $(xuy)z^\omega \in L \Leftrightarrow (xvy)z^\omega \in L$ and

    $z(xuy)^\omega \in L \Leftrightarrow z(xvy)^\omega \in L$

hold for all finite words $x, y, z \in A^*$. Our definition is slightly different from but
equivalent to the syntactic congruence introduced by Arnold [1]. The congruence classes
of $\equiv_L$ form the so-called syntactic semigroup $A^+/{\equiv_L}$ and the syntactic morphism
$h_L\colon A^+ \to A^+/{\equiv_L}$ is the natural quotient map. If $L$ is regular, the syntactic
semigroup of $L$ is finite and $h_L$ strongly recognizes $L$.

Model of computation. Morphisms $h\colon A^+ \to S$ are given implicitly through a
mapping $f\colon A \to S$ with $f(a) = h(a)$ for all $a \in A$. We assume that for finite
semigroups $S$, multiplications can be performed in constant time. Some algorithms
only perform multiplications of the form $h(a) \cdot s$ or $s \cdot h(a)$ where $h$ is a morphism,
$s$ is an element of $S$ and $a$ is a letter. In that case, semigroups can be represented
efficiently by their left and right Cayley graphs. For two elements $s, t \in S$ we can
check in constant time whether $s = t$ and it is possible to organize elements of $S$
in a hash map such that operations on subsets of $S$ can be implemented efficiently.
When a set $P \subseteq S \times S$ is part of the input, we assume that for each $s, t \in S$ one can
check in constant time whether $(s, t) \in P$.
3 Conversion between Büchi automata, weak and strong recognition

In this section, we describe well-known constructions for the conversion between the
different acceptance modes for regular languages. For details and proofs, we refer
to the literature.

3.1 From Büchi automata to strong recognition

In the case of finite words, when proving that each regular language is recognizable by
a morphism onto a finite semigroup, one usually considers the transition semigroup
of a finite automaton. However, when applying the same construction to Büchi
automata, the resulting morphism only weakly recognizes the language. In this
section, we describe a construction to convert a Büchi automaton $\mathcal{A} = (Q, A, \delta, I, F)$
into a semigroup $S$ and a morphism $h\colon A^+ \to S$ that strongly recognizes $L(\mathcal{A})$.

For states $p, q \in Q$ and a finite word $u = a_1 a_2 \cdots a_n \in A^+$, we write $p \xrightarrow{u} q$ if there exists a
sequence $q_0 a_1 q_1 a_2 q_2 \cdots q_{n-1} a_n q_n$ with $q_0 = p$, $q_n = q$ and $(q_i, a_{i+1}, q_{i+1}) \in \delta$ for all
$i \in \{0, \ldots, n-1\}$. If, additionally, $q_i \in F$ for some $i \in \{0, \ldots, n\}$, we write $p \xrightarrow[F]{u} q$.
We now assign to each word $u \in A^+$ a $Q \times Q$ matrix $h(u)$ defined by

    $h(u)_{pq} = 1$ if $p \xrightarrow{u} q$ but not $p \xrightarrow[F]{u} q$,
    $h(u)_{pq} = 2$ if $p \xrightarrow[F]{u} q$,
    $h(u)_{pq} = 0$ otherwise.

A routine verification shows that the image of $A^+$ under $h$ naturally extends
to a semigroup $S$. We say that a linked pair $(R, E)$ where $R = (r_{pq})_{p,q \in Q}$ and
$E = (e_{pq})_{p,q \in Q}$ is accepting if there exist states $p \in I$ and $q \in Q$ such that $r_{pq} \geq 1$ and
$e_{qq} = 2$. One can now verify that the set $P$ of all accepting linked pairs is closed
under conjugation and that $[P] = L(\mathcal{A})$.
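
The matrices above can be multiplied without going back to words. The following is a minimal sketch under the assumption that the product of two matrices takes, for each pair of states, the best value over all intermediate states, where a path through an intermediate state exists only if both entries are non-zero and visits a final state if at least one entry is 2. The function names and the two-state example automaton (accepting words with infinitely many occurrences of a) are illustrative and not taken from the paper.

```python
def letter_matrix(states, delta, final, a):
    """Matrix h(a): 2 if the edge p -a-> q touches a final state, 1 if it exists, else 0."""
    return tuple(
        tuple(
            0 if (p, a, q) not in delta else (2 if p in final or q in final else 1)
            for q in states
        )
        for p in states
    )


def mat_mul(m, n):
    """Product of two matrices over {0, 1, 2} (assumed multiplication rule, see lead-in)."""
    size = range(len(m))

    def entry(p, q):
        best = 0
        for r in size:
            if m[p][r] and n[r][q]:  # a path p -> r -> q exists
                best = max(best, m[p][r], n[r][q])
        return best

    return tuple(tuple(entry(p, q) for q in size) for p in size)


def generate_semigroup(generators):
    """Close the generator matrices under multiplication."""
    elements = set(generators)
    todo = list(elements)
    while todo:
        m = todo.pop()
        for g in generators:
            product = mat_mul(m, g)
            if product not in elements:
                elements.add(product)
                todo.append(product)
    return elements


# Hypothetical automaton: state 1 is final and is entered exactly when reading a.
states = [0, 1]
delta = {(0, "a", 1), (0, "b", 0), (1, "a", 1), (1, "b", 0)}
final = {1}
h_a = letter_matrix(states, delta, final, "a")
h_b = letter_matrix(states, delta, final, "b")
S = generate_semigroup([h_a, h_b])
print(len(S))  # 3
```

In this small example the semigroup consists of the three matrices h(a), h(b) and h(ab).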

3.2 From weak recognition to Büchi automata

Suppose we are given a morphism $h\colon A^+ \to S$ onto a finite semigroup $S$ that weakly
recognizes a language $L$, i.e., $L = [P]$ for some set of linked pairs $P \subseteq S \times E(S)$.
One can use the following known construction to obtain a Büchi automaton $\mathcal{A}$ with
$L(\mathcal{A}) = L$.

The set of states is $Q = S^1 \times E(S)$, the set of initial states is $I = P$ and the set
of final states is $F = \{1\} \times E(S)$. The transition relation $\delta$ consists of all tuples of
the form $((s, e), a, (t, e)) \in Q \times A \times Q$ where $h(a)t = s$ or $h(a)t = se$.

By combining the constructions from this and the previous subsection, we also
obtain a construction to convert a morphism that weakly recognizes a language $L$
into a morphism that strongly recognizes $L$. There are also direct, more efficient
constructions to perform this conversion. The converse direction is trivial
since, by definition, a morphism $h\colon A^+ \to S$ that strongly recognizes a language $L$
also weakly recognizes $L$.

4 Computing conjugacy classes 

When designing an algorithm that takes a set of linked pairs $P \subseteq S \times E(S)$ as input,
it is often convenient to assume that $P$ is closed under conjugation. However, this
is not always the case in practice: The input set $P$ might be a proper subset of its
closure under conjugation $Q$ such that $[P] = [Q]$. In this section, we describe an
algorithm to compute the conjugacy classes efficiently. It justifies the assumption
that $P$ is always closed under conjugation in the following sections, particularly in
Section 6.

As a warm-up, we first describe how to compute the set $F$ of linked pairs. The
linked pairs are exactly the pairs of the form $(se, e)$ with $s \in S$ and $e \in E(S)$. Thus,
we first check for each element $e \in S$ whether $e^2 = e$. If the outcome of the check
is positive, we perform a depth-first search in the left Cayley graph of $S$, starting
at element $e$. For each element $s$ that is visited, $(s, e)$ is a linked pair. The total
running time of this routine is $O(|S| + |A| \cdot |F|)$.
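
The warm-up routine can be sketched as follows. This is a minimal illustration, assuming that the semigroup is given by a list of elements, the list of generators $h(A)$ and a multiplication function; the function name `linked_pairs` and the example (the four-element semigroup with $(i,j) \cdot (k,\ell) = (i,\ell)$) are assumptions made for the sake of the example.

```python
def linked_pairs(elements, generators, mul):
    """Compute all linked pairs (s, e) by a search in the left Cayley graph."""
    pairs = set()
    for e in elements:
        if mul(e, e) != e:
            continue  # e is not idempotent
        stack, seen = [e], {e}
        while stack:
            s = stack.pop()
            pairs.add((s, e))  # every visited s has the form p*e, hence s*e = s
            for g in generators:
                t = mul(g, s)  # edge (s, g, g*s) of the left Cayley graph
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
    return pairs


# Hypothetical example: pairs (i, j) with (i, j) * (k, l) = (i, l).
band = [(i, j) for i in (1, 2) for j in (1, 2)]


def band_mul(s, t):
    return (s[0], t[1])


pairs = linked_pairs(band, [(1, 2), (2, 1)], band_mul)
print(len(pairs))  # 8
```

In this example every element is idempotent and $(s, e)$ is linked exactly when $s$ and $e$ agree in the second component, which gives eight linked pairs.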

An equivalence relation $\equiv$ on the set of linked pairs is called left-stable if for all
$p \in S$ and for linked pairs $(s, e)$, $(t, f)$ with $(s, e) \equiv (t, f)$, we have $(ps, e) \equiv (pt, f)$.
We define an equivalence relation $\approx$ on the set of linked pairs by $(s, e) \approx (t, f)$ if
and only if $e \mathrel{\mathcal{L}} s \mathrel{\mathcal{R}} t \mathrel{\mathcal{L}} f$ or $(s, e) = (t, f)$. Its relationship to conjugacy is captured
in the following Lemma:

Lemma 2 The conjugacy relation $\sim$ is the finest left-stable equivalence relation
coarser than $\approx$.

Proof. It follows directly from the definitions of linked pairs and conjugacy that $\sim$ is
left-stable. Let $(s, e)$ and $(t, f)$ be linked pairs with $(s, e) \approx (t, f)$ and $(s, e) \neq (t, f)$.
Since $s \mathrel{\mathcal{R}} t$, there exist $q, q' \in S^1$ such that $sq = t$ and $tq' = s$. We set $x = eq$ and
$y = fq'$. Now, $sx = seq = sq = t$. Moreover, since $s \mathrel{\mathcal{L}} e$, there exists $p \in S^1$ with
$ps = e$. Thus, we have $xy = eqy = psqy = pty = ptfq' = ptq' = ps = e$. A similar
argument can be used to show that $yx = f$. Hence, $(s, e)$ and $(t, f)$ are conjugate,
and $\sim$ is indeed coarser than $\approx$.

In order to show that $\sim$ is the finest relation with these properties, we consider an
arbitrary left-stable equivalence relation $\equiv$ on the set of linked pairs which is coarser
than $\approx$. We show that $(s, e) \sim (t, f)$ implies $(s, e) \equiv (t, f)$. Let $x, y \in S$ such that
$sx = t$, $xy = e$ and $yx = f$. Then we have $ex = xyx = xf$ and $xfy = xyxy = e^2 = e$,
which shows that $e \mathrel{\mathcal{R}} xf$. Furthermore we have $xf \mathrel{\mathcal{L}} f$, since $yxf = f^2 = f$. By
the definition of $\approx$, this means that $(e, e) \approx (xf, f)$ and since $\approx$ refines $\equiv$, it follows
that $(e, e) \equiv (xf, f)$. Left-stability yields $(s, e) = (se, e) \equiv (sxf, f) = (t, f)$. □
Since $\mathcal{R}$-classes and $\mathcal{L}$-classes can be computed in time linear in the size of the
semigroup, this allows us to efficiently compute the conjugacy classes as shown
in Algorithm 1. We use a so-called disjoint-set data structure that provides two
operations on a partition. Find$(s, e)$ returns a unique element from the class that
contains $(s, e)$, i.e., if $(s, e)$ and $(t, f)$ are in the same class, we have Find$(s, e)$ =
Find$(t, f)$. Union$((s, e), (t, f))$ merges the classes of $(s, e)$ and $(t, f)$. To simplify
the notation we also introduce an operation Union*$(R)$ for subsets $R$ of $S \times S$ that
merges all classes with elements in $R$. Union*$(R)$ can be implemented using $|R| - 1$
atomic Union operations. The partition is initialized with singleton sets $\{(s, e)\}$
for all linked pairs $(s, e)$. The second data structure used in the algorithm is a set
$T \subseteq 2^F$, where $F$ denotes the set of linked pairs.

Algorithm 1 Computing conjugacy classes

    initialize $T$ with the non-trivial equivalence classes of $\approx$
    for all $R \in T$ do Union*$(R)$ end for
    while $T \neq \emptyset$ do
        remove some set $R$ from $T$
        for all $a \in A$ do
            $R' \leftarrow \emptyset$
            for all $(s, e) \in R$ do $R' \leftarrow R' \cup \{$Find$(h(a)s, e)\}$ end for
            if $|R'| > 1$ then
                Union*$(R')$
                $T \leftarrow T \cup \{R'\}$
            end if
        end for
    end while
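
The following Python sketch mirrors Algorithm 1 on a small example. It is an illustration under simplifying assumptions: the $\mathcal{R}$- and $\mathcal{L}$-classes are identified by brute-force comparison of principal ideals (the paper uses Tarjan's algorithm on the Cayley graphs instead), and the disjoint-set structure is a plain dictionary with path halving. All names and the example semigroup (the four maps on a two-element set) are hypothetical.

```python
def conjugacy_classes(elements, generators, mul):
    idempotents = [e for e in elements if mul(e, e) == e]
    pairs = [(s, e) for e in idempotents for s in elements if mul(s, e) == s]
    # Principal right/left ideals identify the R- and L-classes (brute force).
    r_ideal = {s: frozenset([s] + [mul(s, x) for x in elements]) for s in elements}
    l_ideal = {s: frozenset([s] + [mul(x, s) for x in elements]) for s in elements}

    parent = {p: p for p in pairs}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    def union_all(group):
        group = list(group)
        for p in group[1:]:
            parent[find(p)] = find(group[0])

    # Non-trivial classes of the initial relation: pairs (s, e) with s L e,
    # grouped by the R-class of s.
    initial = {}
    for (s, e) in pairs:
        if l_ideal[s] == l_ideal[e]:
            initial.setdefault(r_ideal[s], set()).add((s, e))
    todo = [group for group in initial.values() if len(group) > 1]
    for group in todo:
        union_all(group)

    while todo:
        group = todo.pop()
        for a in generators:
            image = {find((mul(a, s), e)) for (s, e) in group}
            if len(image) > 1:
                union_all(image)
                todo.append(image)

    classes = {}
    for p in pairs:
        classes.setdefault(find(p), set()).add(p)
    return {frozenset(c) for c in classes.values()}


# Hypothetical example: all maps on {0, 1}, written as (f(0), f(1)).
ID, SW, C0, C1 = (0, 1), (1, 0), (0, 0), (1, 1)


def then(f, g):
    """Apply f first, then g."""
    return (g[f[0]], g[f[1]])


classes = conjugacy_classes([ID, SW, C0, C1], [SW, C0], then)
print(len(classes))  # 3
```

In this example the six linked pairs fall into three conjugacy classes of two pairs each; the class containing `(C0, ID)` and `(C1, ID)` is only found in the main loop, not during initialization.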


To prove the correctness and running time of the algorithm, one can combine
Lemma 2 with arguments similar to those given in the correctness and running time
proofs of the Hopcroft-Karp equivalence test. We first show that the relation
induced by the final partition is left-stable:

Lemma 3 Let $(s, e)$ and $(t, f)$ be linked pairs of the same class upon termination.
Then, for each $a \in A$, the pairs $(h(a)s, e)$ and $(h(a)t, f)$ are in the same class as well.


Proof. We write Find$_i(s, e)$ = Find$_i(t, f)$ if $(s, e)$ and $(t, f)$ belong to the same
class after the $i$-th iteration of the while-loop. The index $\infty$ is used to describe the
situation upon termination.

Let $i$ be minimal such that for some pairs $(s, e)$, $(t, f)$ and a letter $a \in A$, we have
Find$_i(s, e)$ = Find$_i(t, f)$ and Find$_\infty(h(a)s, e) \neq$ Find$_\infty(h(a)t, f)$. Note that $i > 0$
because otherwise, a set containing both $(s, e)$ and $(t, f)$ would be added to $T$ during
initialization. Hence, there exists a pair $(s', e')$ with Find$_{i-1}(s', e')$ = Find$_{i-1}(s, e)$
and a pair $(t', f')$ with Find$_{i-1}(t', f')$ = Find$_{i-1}(t, f)$ such that Union*$(R)$ is
executed for some set $R \supseteq \{(s', e'), (t', f')\}$. By choice of $i$, we have Find$_\infty(h(a)s, e)$ =
Find$_\infty(h(a)s', e')$ and Find$_\infty(h(a)t, f)$ = Find$_\infty(h(a)t', f')$. Since we add the set $R$
to $T$ in iteration $i$, the equality Find$_\infty(h(a)s', e')$ = Find$_\infty(h(a)t', f')$ holds as well,
and thus Find$_\infty(h(a)s, e)$ = Find$_\infty(h(a)t, f)$, a contradiction. □

There is of course a dual statement for the pairs $(s \cdot h(a), e)$ and $(t \cdot h(a), f)$.

Theorem 4 Let $F$ be the set of linked pairs of $S$. When Algorithm 1 terminates,
the classes of the partition correspond to the conjugacy classes of $F$. Furthermore,
the algorithm executes at most

► $|F| - 1$ Union operations and

► $2|A|(|F| - 1)$ Find operations.

Proof. By Lemma 3 the relation induced by the final partition is left-stable and
throughout the main algorithm, two classes are only merged when required to
establish this property. Thus, the relation is the finest left-stable equivalence relation
coarser than $\approx$ and, by Lemma 2, equivalent to the conjugacy relation.

The number of Union operations is bounded by $|F| - 1$ since each operation reduces
the number of classes in the partition by 1. Let $R_1, \ldots, R_k$ be the sets that are
added to $T$ during the execution of the algorithm. Whenever one of the sets $R_i$ is
inserted into $T$, $|R_i| - 1$ Union operations are executed. Thus, we have

    $\sum_{i=1}^{k} (|R_i| - 1) \leq |F| - 1.$

When $R_i$ is removed from $T$, exactly $|A| \cdot |R_i|$ Find operations are executed in the
same iteration of the while-loop. The total number of Find operations is therefore
bounded by

    $\sum_{i=1}^{k} |A| \cdot |R_i| \leq \sum_{i=1}^{k} |A| \cdot (2|R_i| - 2) \leq 2|A| \cdot (|F| - 1)$

where the first inequality follows from the fact that each of the sets $R_i$ contains at
least two elements. □

A sequence of $n$ Union- and $m$ Find-operations can be performed in $O(n + m \cdot \alpha(n))$
time where $\alpha(n)$ denotes the extremely slow-growing inverse Ackermann function.
Thus, when considering a fixed-size alphabet, the total running time of our algorithm
is "almost linear" in the number of linked pairs.

5 Testing for strong recognition 

Common decision problems, such as the universality problem or the inclusion
problem, are easy in the case of strong recognition. In the context of weak recognition,
the algorithm presented in this section is a powerful tool to answer a broad range of
similar problems. Given a morphism $h\colon A^+ \to S$ onto a finite semigroup $S$ and two
sets of linked pairs $P, Q \subseteq S \times E(S)$, it can be used to check whether $[P] \subseteq [Q]$. In
particular, it allows for testing whether the morphism strongly recognizes a language
$L = [P]$ by first computing the closure $Q$ of $P$ under conjugation and then using
the algorithm to test whether $[Q] \subseteq [P]$.

Before we present the algorithm, we remark that inclusion is not only a property
of the semigroup $S$ and the sets $P$ and $Q$ but it also depends on the set of generators
$h(A)$. In order to see this, we consider the finite semigroup $S = \{(i, j) \mid 1 \leq i, j \leq 2\}$
with the multiplication given by $(i, j) \cdot (k, \ell) = (i, \ell)$ for all $i, j, k, \ell \in \{1, 2\}$. Let
$A = \{a, b\}$ and let $h\colon A^+ \to S$ be the surjective morphism defined by $h(a) = (1, 2)$
and $h(b) = (2, 1)$. We consider the two sets of linked pairs $P = \{((1,1), (1,1))\}$ and
$Q = \{((1,2), (2,2))\}$. It is easy to check that $[P] = [Q] = (a^+ b^+)^\omega$. However, if
we add a new letter $c$ to $A$ and extend $h$ by setting $h(c) = (1, 1)$, the infinite word
$c^\omega$ is contained in $[P]$ but not in $[Q]$, which implies $[P] \not\subseteq [Q]$. The morphism (or
another description of the set of generators) thus needs to be part of the input of
any algorithm performing the inclusion test described above.

Let us now describe the algorithm. It maintains two sets $R, T \subseteq S \times S^1 \times S^1$.
The former keeps record of the elements that are added to $T$ during the course of
the algorithm. To simplify the presentation, we define $x \cdot a^{-1}$ to be the set of all
elements $p \in S^1$ which satisfy the equation $p \cdot h(a) = x$.


Algorithm 2 Testing for strong recognition

    initialize $R$ and $T$ with the set $\{(s, e, 1) \mid (s, e) \in P\}$
    while $T \neq \emptyset$ do
        remove some element $(s, x, y)$ from $T$
        if $x = 1$ then return "$[P] \not\subseteq [Q]$" end if
        if $(sx, yxyx) \notin Q$ then
            for all $a \in A$, $p \in x \cdot a^{-1}$ do
                if $(s, p, h(a)y) \notin R$ then add $(s, p, h(a)y)$ to $R$ and to $T$ end if
            end for
        end if
    end while
    return "$[P] \subseteq [Q]$"
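
Algorithm 2 can be prototyped in a few lines. The sketch below is an illustration under assumptions: the adjoined neutral element $1$ is represented by `None`, the sets $x \cdot a^{-1}$ are precomputed in dictionaries, and `letters` stands for the list of generator images $h(a)$. The function name `included` is hypothetical. The example reuses the four-element semigroup with $(i,j) \cdot (k,\ell) = (i,\ell)$ discussed before the algorithm.

```python
from collections import deque


def included(elements, letters, mul, P, Q):
    """Return True iff [P] is a subset of [Q] (sketch of Algorithm 2)."""
    ONE = None  # stands for the neutral element adjoined to S

    def mul1(x, y):
        if x is ONE:
            return y
        if y is ONE:
            return x
        return mul(x, y)

    # preimages[a][x] lists all p in S^1 with p * a = x
    preimages = {}
    for a in letters:
        table = {}
        for p in [ONE] + list(elements):
            table.setdefault(mul1(p, a), []).append(p)
        preimages[a] = table

    seen = {(s, e, ONE) for (s, e) in P}
    queue = deque(seen)
    while queue:
        s, x, y = queue.popleft()
        if x is ONE:
            return False
        yx = mul1(y, x)
        if (mul(s, x), mul(yx, yx)) not in Q:
            for a in letters:
                for p in preimages[a].get(x, []):
                    triple = (s, p, mul1(a, y))
                    if triple not in seen:
                        seen.add(triple)
                        queue.append(triple)
    return True


band = [(i, j) for i in (1, 2) for j in (1, 2)]


def band_mul(s, t):
    return (s[0], t[1])


P = {((1, 1), (1, 1))}
Q = {((1, 2), (2, 2))}
print(included(band, [(1, 2), (2, 1)], band_mul, P, Q))          # True
print(included(band, [(1, 2), (2, 1), (1, 1)], band_mul, P, Q))  # False
```

The two calls differ only in the set of generators, which illustrates why the morphism has to be part of the input.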


The following technical Lemma is crucial for the correctness proof of the algorithm: 


Lemma 5 Let $u, v \in A^+$ and let $(s, e)$ and $(h(u), h(v))$ be linked pairs. Then $uv^\omega$
is contained in $[s][e]^\omega$ if and only if there exists a factorization $v = v_1 v_2$ such that
$v_1 \neq \varepsilon$, $h(uv_1) = s$ and $h(v_2 v v_1) = e$.
Proof. Let $v = a_1 a_2 \cdots a_n$ with $n \geq 1$ and $a_i \in A$. If $uv^\omega$ is contained in $[s][e]^\omega$,
there exists a factorization $uv^\omega = u' v_1' v_2' \cdots$ such that $h(u') = s$ and $h(v_i') = e$ for
all $i \geq 1$. Since $u$ and $v$ are finite words, there exist indices $j > i \geq 1$, powers
$k, \ell \geq 1$ and a position $m \in \{1, \ldots, n\}$ such that $u' v_1' v_2' \cdots v_{i-1}' = u v^k a_1 a_2 \cdots a_m$
and $v_i' \cdots v_j' = a_{m+1} a_{m+2} \cdots a_n v^\ell a_1 a_2 \cdots a_m$. We set $v_1 = a_1 a_2 \cdots a_m$ and
$v_2 = a_{m+1} a_{m+2} \cdots a_n$. Then $v_1 v_2 = v$,

    $h(uv_1) = h(u v^k a_1 a_2 \cdots a_m) = h(u' v_1' v_2' \cdots v_{i-1}') = s e^{i-1} = s$ and

    $h(v_2 v v_1) = h(a_{m+1} a_{m+2} \cdots a_n v^\ell a_1 a_2 \cdots a_m) = h(v_i' \cdots v_j') = e^{j-i+1} = e.$

To prove the converse direction, consider the factorization $uv^\omega = uv_1 (v_2 v v_1)^\omega$. □
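
Lemma 5 yields a direct membership test for ultimately periodic words. The following is a minimal sketch, assuming that words are Python strings, that `h` is a dictionary mapping letters to semigroup elements, and that the caller guarantees that $(h(u), h(v))$ and $(s, e)$ are linked pairs; the function name `lasso_in` is hypothetical.

```python
def lasso_in(h, mul, u, v, s, e):
    """Decide whether u v^omega lies in [s][e]^omega, following Lemma 5.

    Assumes that (h(u), h(v)) and (s, e) are linked pairs.
    """
    def image(word):
        result = h[word[0]]
        for letter in word[1:]:
            result = mul(result, h[letter])
        return result

    for cut in range(1, len(v) + 1):  # v1 = v[:cut] is non-empty
        v1, v2 = v[:cut], v[cut:]
        if image(u + v1) == s and image(v2 + v + v1) == e:
            return True
    return False


def band_mul(x, y):
    return (x[0], y[1])


h = {"a": (1, 2), "b": (2, 1), "c": (1, 1)}
# (ab)^omega belongs to both [P] and [Q] from the example in this section ...
print(lasso_in(h, band_mul, "ab", "ab", (1, 1), (1, 1)))  # True
print(lasso_in(h, band_mul, "ab", "ab", (1, 2), (2, 2)))  # True
# ... whereas c^omega belongs to [P] but not to [Q].
print(lasso_in(h, band_mul, "c", "c", (1, 1), (1, 1)))    # True
print(lasso_in(h, band_mul, "c", "c", (1, 2), (2, 2)))    # False
```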

To simplify the proofs of the following two Lemmas, we extend $h$ to a monoid
morphism $h^1\colon A^* \to S^1$ by setting $h^1(u) = h(u)$ for all $u \in A^+$ and $h^1(\varepsilon) = 1$.

Lemma 6 If the difference $[P] \setminus [Q]$ is non-empty, the algorithm returns "$[P] \not\subseteq [Q]$".


Proof. By the closure properties of regular languages, we know that there exists a
word $\alpha = u(a_1 a_2 \cdots a_n)^\omega \in [P] \setminus [Q]$. Let $s = h(u)$ and $e = h(a_1 a_2 \cdots a_n)$. Lemma 5
shows that we can assume without loss of generality that $(s, e)$ is contained in $P$.
We now prove by induction on the parameter $k$ that upon termination, we have
$(s, h^1(a_1 a_2 \cdots a_k), h^1(a_{k+1} a_{k+2} \cdots a_n)) \in R$ for all $k \in \{0, \ldots, n\}$. In particular, by
considering the case $k = 0$, we see that the element $(s, 1, e)$ is added to $R$. Since
every element added to $R$ is also added to $T$, the algorithm returns "$[P] \not\subseteq [Q]$".

The base case $k = n$ is covered by the initialization of the set $R$. Let now
$k < n$, $x = h^1(a_1 a_2 \cdots a_{k+1})$ and $y = h^1(a_{k+2} a_{k+3} \cdots a_n)$. By the induction
hypothesis, we know that the tuple $(s, x, y)$ is added to $T$ during the course of the
algorithm. Consider the iteration when this tuple is removed from $T$. Because
of $\alpha \notin [Q]$, we know that $(sx, yxyx) \notin Q$. Thus the inner loop guarantees that
$(s, h^1(a_1 a_2 \cdots a_k), h^1(a_{k+1} a_{k+2} \cdots a_n))$ is added to $R$. □

Lemma 7 If the algorithm returns "$[P] \not\subseteq [Q]$", the difference $[P] \setminus [Q]$ is non-empty.


Proof. We construct a word in the difference $[P] \setminus [Q]$. For every triple $(s, e, 1)$ that
is added to $R$ during the initialization, we define $w[s, e, 1] = \varepsilon$. If a triple $(s, p, h(a)y)$
is added to $R$ later, we set $w[s, p, h(a)y] = a \cdot w[s, p \cdot h(a), y]$. For every $(s, x, y) \notin R$,
the word $w[s, x, y]$ is undefined. For the other words, well-definedness follows from
the fact that each triple $(s, x, y)$ is added to $R$ at most once. Furthermore, if $w[s, x, y]$
is defined, its image under $h^1$ is $y$ and we have $(s, xy) \in P$. Both properties are easy
to prove by induction.

Let $(s, 1, y)$ be the triple that was removed from $T$ immediately before the
termination of the algorithm. Consider an arbitrary word $u \in [s]$ and set $v = w[s, 1, y]$.
We have $(s, y) \in P$ and thus $uv^\omega \in [P]$. For every factorization $v = v_1 a v_2$ where
$v_1, v_2 \in A^*$ and $a \in A$, the word $w[s, h^1(v_1), h^1(a v_2)]$ is defined as $a v_2$ and thus, the
tuple $(h(u v_1 a), h(v_2 v v_1 a))$ is not contained in $Q$. In view of Lemma 5 this shows
that $uv^\omega \notin [Q]$. □

We are now able to state the main result of this section: 

Theorem 8 Given a morphism $h\colon A^+ \to S$ onto a finite semigroup $S$ and two
sets of linked pairs $P, Q \subseteq S \times E(S)$, one can check in $O(|A| \cdot |S|^3)$ time whether
$[P] \subseteq [Q]$.

Proof. The correctness of Algorithm 2 follows from the previous two Lemmas. Since
$R$ contains at most $(|S| + 1)^3$ elements when the algorithm terminates, the outer
loop is executed at most $(|S| + 1)^3$ times. Moreover, for all $a \in A$ and $s, t \in S$ with
$s \neq t$, the sets $s \cdot a^{-1}$ and $t \cdot a^{-1}$ are disjoint. Thus, each element $p \in S^1$ is considered
at most $|A| \cdot (|S| + 1)^2$ times in the inner loop. If $R$ is implemented as a bit field
and $T$ is implemented as a linked list, all operations take constant time. This shows
that the total running time is in $O(|A| \cdot |S|^3)$. □

6 Computation of the syntactic morphism 

In this section, we present an algorithm to compute the syntactic semigroup for a
given language. The syntactic morphism is obtained as a byproduct. One can
show that the syntactic semigroup is the smallest semigroup strongly recognizing a
language, so this operation is similar to the minimization of finite automata.
The most important difference is that our algorithm requires only quadratic time,
whereas minimization is PSPACE-hard in the case of Büchi automata.

Let $S$ be a finite semigroup, let $h\colon A^+ \to S$ be a surjective morphism and let $P$ be
a set of linked pairs that is closed under conjugation. To make the following notation
more readable, we define $Q$ as the maximal subset of $S \times S$ such that $[P] = [Q]$.

Lemma 9 Let $u, v \in A^+$. Then $uv^\omega \in [P]$ if and only if $(h(u), h(v)) \in Q$.

Proof. Suppose that $uv^\omega \in [P]$. By Proposition 1 we have $[h(u)][h(v)]^\omega \subseteq [P] =
[Q]$. Since $Q$ is maximal, the pair $(h(u), h(v))$ is contained in $Q$. The converse
implication is trivial. □

We now define an equivalence relation $\equiv$ on $S$ by $s \equiv t$ if for all $z \in S$, we have

    $(z, s) \in Q \Leftrightarrow (z, t) \in Q$ and
    $(s, z) \in Q \Leftrightarrow (t, z) \in Q$.

Moreover, let $\simeq$ be the coarsest congruence on $S$ that refines $\equiv$, i.e., $s \simeq t$ if $xsy \equiv
xty$ for all $x, y \in S^1$. We denote by $[s]_{\simeq}$ the equivalence class $\{t \in S \mid t \simeq s\}$ of an
element $s \in S$. The relation $\simeq$ is closely related to the syntactic congruence, as
confirmed by the following result:
Proposition 10 Let $L = [P]$. The quotient semigroup $S/{\simeq}$ is isomorphic to $A^+/{\equiv_L}$.

Proof. We first define a morphism $g\colon A^+ \to S/{\simeq}$ by setting $g(u) = [h(u)]_{\simeq}$ for
all $u \in A^+$. Let now $u, v \in A^+$. By Lemma 9 we have $h(u) \simeq h(v)$ if and only if
$h_L(u) = h_L(v)$. Thus, $g \circ h_L^{-1}$ is a semigroup isomorphism. □

The computation of the syntactic semigroup requires two steps: 

1. Compute the partition induced by the equivalence relation $\equiv$.

2. Refine the partition until the underlying equivalence relation becomes a
congruence.

The first step can be performed in time quadratic in the size of the semigroup.
For the second step, we can adapt Hopcroft's minimization algorithm for finite
automata. For $C \subseteq S$ and $a \in A$, we define

    $C \cdot a^{-1} = \{s \in S \mid s \cdot h(a) \in C\}$ and $a^{-1} \cdot C = \{s \in S \mid h(a) \cdot s \in C\}$.

The full algorithm is shown in Algorithm 3. It relies on the Split routine that is
usually implemented as part of a partition refinement data structure.
Its semantics is shown in Algorithm 4. In addition to modifying the
partition, that routine also updates a set $T \subseteq 2^S$ that is used in the main algorithm.


Algorithm 3 Computing the syntactic semigroup

    initialize a partition with a single class $S$
    for all $s \in S$ do
        Split($\{t \in S \mid (s, t) \in Q\}$)
        Split($\{t \in S \mid (t, s) \in Q\}$)
    end for
    initialize $T$ with the non-trivial classes of the partition
    while $T \neq \emptyset$ do
        remove some set $C$ from $T$
        for all $a \in A$ do
            Split($C \cdot a^{-1}$)    ▷ Refine the partition and update $T$
            Split($a^{-1} \cdot C$)    ▷ Refine the partition and update $T$
        end for
    end while


The next Lemma shows that upon termination, the equivalence relation induced
by the partition is indeed a congruence:

Lemma 11 If, upon termination, the elements $s$ and $t$ belong to the same class of
the partition, then, for each $a \in A$, the elements $h(a)s$ and $h(a)t$ are in the same
class as well.
Algorithm 4 The Split operation to refine a partition $\mathcal{P}$

    procedure Split($X$)
        for all $C \in \mathcal{P}$ do
            $C_1 \leftarrow C \cap X$, $C_2 \leftarrow C \setminus X$
            if $C_1 \neq \emptyset$ and $C_2 \neq \emptyset$ then
                $\mathcal{P} \leftarrow (\mathcal{P} \setminus \{C\}) \cup \{C_1, C_2\}$
                if $C \in T$ then
                    $T \leftarrow (T \setminus \{C\}) \cup \{C_1, C_2\}$
                else
                    if $|C_1| \leq |C_2|$ then $T \leftarrow T \cup \{C_1\}$ else $T \leftarrow T \cup \{C_2\}$ end if
                end if
            end if
        end for
    end procedure


Proof. Suppose that $h(a) \cdot s$ and $h(a) \cdot t$ belong to different classes. These elements
are split either during the initialization or in the main loop. In either case, a set $C$
that contains either $h(a) \cdot s$ or $h(a) \cdot t$ is added to $T$. When this set is removed from
$T$, the operation Split($a^{-1} \cdot C$) ensures that $s$ and $t$ lie in different classes as well. □

There is of course a dual statement for the elements $s \cdot h(a)$ and $t \cdot h(a)$.

Theorem 12 The syntactic morphism can be computed in $O(|S|^2 + |A| \cdot |S| \log |S|)$
time.


Proof. Let us first argue that Algorithm [3] is correct. The partition is initialized 
with the equivalence classes of =. A class is only split when it is necessary to restore 
the left-stability or right-stability. Upon termination, the relation induced by the 
partition is a congruence, as stated in Lemma flTl Thus, it is the coarsest congruence 
that refines = and hence equivalent to =. 

For the analysis of the running time, we assume that the operation Split(A) can 
be implemented in time linear in |A|. Then the initialization clearly takes 0(|S'|^) 
time. We denote by Ci,... ,Ck the sets that are added to T during the course of 
the algorithm. Let s € S and let Ug = {i | 1 ^ f ^ A:, s G Ci} be the number of sets 
Ci containing s. At any point in time, there is at most one set in T that contains s. 
If such a set C is removed from T and another set C with s G C' is added to T at 
a later point in time, we have that \C'\ ^ |C| /2. Thus, the inequality n* ^ log l^j 
holds for all s G 5* and we have 
k 


i=la&A 


-1 


+ 


-1 


•C,: 


{ns-hia) + nh(a).s) ^ 2 |A| • IS"! log |5|. 


Consequently, the total running time of the while-loop is in $O(|A| \cdot |S| \log |S|)$,
assuming that $T$ is implemented efficiently, e.g. as a linked list. □
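
The refinement loop analysed above can be illustrated by the following Python sketch. It is not the algorithm of the paper verbatim: the function name `coarsest_congruence`, the representation of the semigroup by a multiplication function `mul`, and the choice to put all initial classes into the worklist are assumptions made for this illustration. The preimages $a^{-1} \cdot C$ and $C \cdot a^{-1}$ are computed naively, so the sketch does not achieve the running time stated in Theorem 12.

```python
def coarsest_congruence(elements, mul, gens, initial):
    """Coarsest partition refining `initial` that is stable under left and
    right multiplication by the generators in `gens` (elements of the semigroup)."""
    P = {frozenset(b) for b in initial}
    T = set(P)                                   # worklist of pending classes

    def split(A):
        for C in list(P):
            C1, C2 = C & A, C - A
            if C1 and C2:
                P.remove(C)
                P.update((C1, C2))
                if C in T:
                    T.remove(C)
                    T.update((C1, C2))
                else:
                    T.add(min(C1, C2, key=len))  # queue the smaller part only

    while T:
        C = T.pop()
        for a in gens:
            # a^{-1} C: elements that are mapped into C by left multiplication
            split(frozenset(s for s in elements if mul(a, s) in C))
            # C a^{-1}: elements that are mapped into C by right multiplication
            split(frozenset(s for s in elements if mul(s, a) in C))
    return P


# Example: the cyclic group Z_4, generated by 1.
add4 = lambda x, y: (x + y) % 4
fine = coarsest_congruence(range(4), add4, [1], [{0}, {1, 2, 3}])
coarse = coarsest_congruence(range(4), add4, [1], [{0, 2}, {1, 3}])
```

In the first call, the only congruence refining the initial partition is equality, so all classes become singletons. In the second call, the initial partition is already a congruence and is returned unchanged.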
If the alphabet $A$ is fixed and the semigroup $S$ becomes large, the running time
is dominated by the initialization. However, one can show that the algorithm we
presented is close to optimal. Before we start with the proof of the optimality result,
we need the following technical lemma that asserts the existence of a semigroup
with certain properties:

Lemma 13 For every $n \ge 4$ there exist a semigroup $T$ with $n^2 \cdot 2^n + n$ elements
and a set $D \subseteq E(T)$ such that the following properties hold:

1. $T$ has rank 2, i.e., $T$ is $X$-generated for some $X \subseteq T$ with $|X| = 2$.

2. $|D| = 2^{n-1}$.

3. For all $e, f \in D$ and $x, y \in T$, we have $e = f$ or $xy \neq e$ or $yx \neq f$.

Proof. Let $n \ge 4$ and let $N = \{0, \ldots, n-1\}$. Let $T$ be the set $N \times 2^N \times N \cup N$.
We denote by $+$ the addition modulo $n$, which can be extended to $T$ as follows:

$$(i, X, j) + (k, Y, \ell) = (i, X \cup \{j + k\} \cup Y, \ell)$$
$$(i, X, j) + k = (i, X, j + k)$$
$$i + (j, X, k) = (i + j, X, k)$$

for all $i, j, k, \ell \in N$ and $X, Y \subseteq N$. It is easy to check that this operation is
associative and thus, $(T, +)$ forms a semigroup. The number of elements of $T$ is
$n^2 \cdot 2^n + n$. One can also easily verify that $T$ is $\{1, (0, \emptyset, 0)\}$-generated.

Now, consider the set $D$ of all elements of the form $(0, X, 0)$ for $0 \in X \subseteq N$.
We have $(0, X, 0) \cdot (0, X, 0) = (0, X \cup \{0\} \cup X, 0) = (0, X, 0)$ and thus, $D \subseteq E(T)$.
The number of elements in $D$ is $2^{n-1}$. To show property 3 we assume that there
exist $E, F \subseteq N$ and $(i, X, j), (k, Y, \ell) \in T$ such that $(i, X, j) \cdot (k, Y, \ell) = (0, E, 0)$ and
$(k, Y, \ell) \cdot (i, X, j) = (0, F, 0)$. By the definition of the operation $+$ on $T$, this implies
$i = j = k = \ell = 0$. Moreover, we have $E = X \cup \{0\} \cup Y = Y \cup \{0\} \cup X = F$. The
other cases ($x \in N$ or $y \in N$) are similar. □
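
For small parameters, the construction can be checked mechanically. The following Python sketch builds $T$ and $D$ as in the proof and provides helper functions for testing the three properties. The function names (`build_T`, `add`, `generated`, `property3_holds`) and the encoding of elements (integers for $N$, tuples with frozensets for the triples) are assumptions made for this illustration.

```python
from itertools import combinations


def build_T(n):
    """Return the elements of T and the set D for the parameter n."""
    N = range(n)
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(N, r)]
    T = list(N) + [(i, X, j) for i in N for X in subsets for j in N]
    D = [(0, X, 0) for X in subsets if 0 in X]
    return T, D


def add(x, y, n):
    """The operation + on T: integers encode N, tuples encode the triples."""
    x_int, y_int = isinstance(x, int), isinstance(y, int)
    if x_int and y_int:
        return (x + y) % n
    if x_int:
        j, X, k = y
        return ((x + j) % n, X, k)
    if y_int:
        i, X, j = x
        return (i, X, (j + y) % n)
    (i, X, j), (k, Y, l) = x, y
    return (i, X | {(j + k) % n} | Y, l)


def generated(gens, n):
    """Closure of the generators under the operation +."""
    seen, frontier = set(gens), list(gens)
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = add(x, g, n)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen


def property3_holds(T, D, n):
    """Check that xy = e and yx = f with e, f in D imply e = f."""
    D = set(D)
    for x in T:
        for y in T:
            e, f = add(x, y, n), add(y, x, n)
            if e in D and f in D and e != f:
                return False
    return True


T4, D4 = build_T(4)
```

For $n = 4$, the list `T4` has $4^2 \cdot 2^4 + 4 = 260$ elements and `D4` has $2^3 = 8$ elements, in accordance with the lemma.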

We now use the previous lemma to construct another semigroup with four
generators and a large number of conjugacy classes.

Lemma 14 Let $A = \{a, b, \bar a, \bar b\}$, let $c \in \mathbb{N}$ and let $\lambda \in \mathbb{R}$ be a strictly positive
number. Then there exist a semigroup $S$ and a surjective morphism $g\colon A^+ \to S$,
such that $S$ has more than $c \cdot |S|^{2-\lambda}$ conjugacy classes.

Proof. We first define $B = \{a, b\}$, $\bar B = \{\bar a, \bar b\}$, and choose $n \ge 4$ such that $32 c n^4 < 2^{\lambda n}$.
Let $T$ be a finite semigroup and let $D$ be a subset of $E(T)$ with the properties
described in Lemma 13. Let $h\colon B^+ \to T$ be a surjective homomorphism. We denote
by $\bar T$ a disjoint copy of $T$ and by $\bar h$ the morphism $\bar h\colon \bar B^+ \to \bar T$ induced by $h$. Now we
define $S = (T^1 \times \bar 1) \cup (1 \times \bar T^1) \setminus \{(1, \bar 1)\}$ with the multiplication

$$(s, \bar s) \cdot (t, \bar t) = \begin{cases} (1, \bar s \cdot \bar t) & \text{if } s = t = 1 \\ (s \cdot t, \bar 1) & \text{otherwise} \end{cases}$$

where $1$ denotes the identity in $T^1 \setminus T$ and $\bar 1$ denotes the identity in $\bar T^1 \setminus \bar T$. By
construction, the semigroup $S$ has $2 n^2 2^n + 2n + 1 < 4 n^2 2^n$ elements. The morphism
$g\colon A^+ \to S$ defined by $g(c) = (h(c), \bar 1)$ for $c \in B$ and $g(c) = (1, \bar h(c))$ for $c \in \bar B$ is
surjective.

Consider the set $F = (T \times \bar 1) \times (1 \times \bar D)$. We will show that $F$ contains more than
$c \cdot |S|^{2-\lambda}$ elements, that each element of $F$ is a linked pair of $S$ and that no two
different elements of $F$ are conjugate, thereby proving the claim.

We start with the cardinality of $F$. We have

$$|F| > n^2 2^{2n-1} = n^2 2^{2n - \lambda n + \lambda n - 1} \ge 16 c n^4 (2^n)^{2-\lambda} > c (4 n^2 2^n)^{2-\lambda} > c |S|^{2-\lambda},$$

where the second inequality follows by the choice of $n$. Showing that $F$ only consists
of linked pairs is easy and is left as an exercise to the reader. Now consider two pairs
$((s, \bar 1), (1, \bar e))$ and $((t, \bar 1), (1, \bar f))$ from $F$. Suppose these pairs are conjugate, i.e.,
there exist $(x, \bar x), (y, \bar y) \in S$ such that $(s, \bar 1) \cdot (x, \bar x) = (t, \bar 1)$,
$(x, \bar x) \cdot (y, \bar y) = (1, \bar e)$ and $(y, \bar y) \cdot (x, \bar x) = (1, \bar f)$. From the second
equation, we see that $x = y = 1$. Therefore, $s = t$. Additionally, we have $\bar x \bar y = \bar e$, as
well as $\bar y \bar x = \bar f$. Property 3 in Lemma 13 yields $\bar e = \bar f$. □
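
The counting argument can be made concrete for given values of $c$ and $\lambda$. The sketch below determines the smallest admissible $n$ and computes the resulting sizes of $S$ and $F$ from the formulas in the proof. The function names `smallest_n` and `sizes` are hypothetical and only serve this illustration.

```python
def smallest_n(c, lam):
    """Smallest n >= 4 with 32 * c * n^4 < 2^(lam * n)."""
    n = 4
    while 32 * c * n**4 >= 2 ** (lam * n):
        n += 1
    return n


def sizes(n):
    """Sizes of S and F for the semigroup T of Lemma 13 with parameter n."""
    size_T = n * n * 2**n + n        # |T| = n^2 * 2^n + n
    size_D = 2 ** (n - 1)            # |D| = 2^(n-1)
    size_S = 2 * size_T + 1          # |S| = 2 n^2 2^n + 2n + 1
    size_F = size_T * size_D         # |F| = |T| * |D|
    return size_S, size_F


c, lam = 1, 1.0
n = smallest_n(c, lam)
size_S, size_F = sizes(n)
```

For $c = 1$ and $\lambda = 1$, the condition $32 n^4 < 2^n$ first holds at $n = 24$, and the resulting $|F|$ exceeds $c \cdot |S|^{2-\lambda} = |S|$ by a wide margin.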

The optimality result now follows by using the previous construction as an input 
to the minimization algorithm. 

Proposition 15 The syntactic morphism cannot be computed in time $O(|S|^{2-\lambda})$ for
any strictly positive, fixed value $\lambda \in \mathbb{R}$.

Proof. Assume there exists an algorithm and a constant $c \ge 1$ such that every input
of size $n = |S|$ can be minimized in time $T(n) \le c \cdot n^{2-\lambda}$. Consider the execution of
the algorithm on the semigroup $S$ described in Lemma 14 and on $P = F$. We denote
by $(s_1, e_1), (s_2, e_2), \ldots, (s_\ell, e_\ell)$ the sequence of linked pairs for which the algorithm
checks whether $(s_i, e_i) \in P$. We have $\ell \le T(n) \le c \cdot n^{2-\lambda}$ and thus, there is a
conjugacy class $C$ such that $(s_i, e_i) \notin C$ for all $i \in \{1, \ldots, \ell\}$. Since the algorithm
is deterministic, the execution sequence on input $Q = P \setminus C$ is the same, and the
algorithm returns, again, the trivial semigroup consisting of one element. However,
$[Q] \neq A^\omega$ and thus, the algorithm is incorrect. □


7 Language operations on morphisms 

One of the merits of strong recognition is that complementation is easy. If a
morphism $h\colon A^+ \to S$ onto a finite semigroup $S$ strongly recognizes a language $L \subseteq A^\omega$,
it also strongly recognizes the complement $A^\omega \setminus L$. As in the case of finite words, we
can use direct products for unions and intersections.
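
On the level of accepting sets, these operations are straightforward to implement. The following Python sketch assumes that a language is represented by a set $P$ of linked pairs of the semigroup, where a linked pair is a pair $(s, e)$ with $se = s$ and $e^2 = e$. The function names and the representation of a semigroup by a list of elements and a multiplication function are assumptions of this illustration.

```python
from itertools import product


def linked_pairs(elements, mul):
    """All pairs (s, e) with s * e == s and e * e == e."""
    return {(s, e) for s, e in product(elements, repeat=2)
            if mul(s, e) == s and mul(e, e) == e}


def complement(elements, mul, P):
    """Accepting set of the complement: all linked pairs that are not in P."""
    return linked_pairs(elements, mul) - set(P)


def direct_product(mul1, mul2):
    """Componentwise multiplication on pairs."""
    return lambda x, y: (mul1(x[0], y[0]), mul2(x[1], y[1]))


def union(elements1, mul1, P1, elements2, mul2, P2):
    """Accepting set for the union on the full direct product.
    For the intersection, replace `or` by `and`."""
    elements = list(product(elements1, elements2))
    pairs = linked_pairs(elements, direct_product(mul1, mul2))
    return {(s, e) for s, e in pairs
            if (s[0], e[0]) in P1 or (s[1], e[1]) in P2}


# Example: the two-element monoid {0, 1} with the usual multiplication.
U1 = [0, 1]
times = lambda x, y: x * y
all_pairs = linked_pairs(U1, times)
co_P = complement(U1, times, {(0, 0)})
both = union(U1, times, {(0, 0)}, U1, times, {(1, 1)})
```

Note that the image of the product morphism may be a proper subsemigroup of the direct product. The sketch works on the full product for simplicity.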

Another operation on languages which is of particular interest when it comes
to converting MSO formulas to strongly recognizing morphisms is the application of
so-called length-preserving morphisms. Suppose we are given alphabets $A$, $B$ and a
length-preserving morphism $\pi\colon A^+ \to B^+$, i.e., $\pi(a) \in B$ for all $a \in A$. We naturally extend this
morphism to infinite words by setting $\pi(a_1 a_2 \cdots) = \pi(a_1) \pi(a_2) \cdots$ and to languages
$L \subseteq A^\omega$ by setting $\pi(L) = \{\pi(\alpha) \mid \alpha \in L\}$.

Proposition 16 Let $\pi\colon A^+ \to B^+$ be a length-preserving morphism, let $S$ be a finite
semigroup and let $h\colon A^+ \to S$ be a surjective morphism that strongly recognizes a
language $L \subseteq A^\omega$. Then there exist a semigroup $T$ of size $2^{|S|}$ and a morphism
$g\colon B^+ \to T$ that strongly recognizes $\pi(L)$.

Proof. We first define $T$ to be the set $2^S$ of all subsets of $S$ and extend it to a
semigroup by defining an associative multiplication $X \cdot Y = \{xy \mid x \in X, y \in Y\}$.
The morphism $g\colon B^+ \to T$ is uniquely defined by $g(a) = h(\pi^{-1}(a))$ for all $a \in B$.

Let us now verify that $g$ strongly recognizes $\pi(L)$. Consider a linked pair $(s, e)$
and two infinite words $\alpha, \beta \in g^{-1}(s)(g^{-1}(e))^\omega$. By Proposition [H it suffices to show
that $\alpha \in \pi(L)$ implies $\beta \in \pi(L)$. If $\alpha$ is contained in $\pi(L)$, we can conclude by
Ramsey's theorem that there exists a linked pair $(t, f)$ of $S$ with $t \in s$, $f \in e$
and $h^{-1}(t)(h^{-1}(f))^\omega \cap L \neq \emptyset$. By assumption, $h$ strongly recognizes $L$ and thus,
we have $h^{-1}(t)(h^{-1}(f))^\omega \subseteq L$. Since we know that there exists an infinite word
$u v_1 v_2 \cdots \in \pi^{-1}(\beta)$ such that $h(u) = t$ and $h(v_i) = f$ for all $i \ge 1$, this immediately
yields $u v_1 v_2 \cdots \in L$ and hence $\beta \in \pi(L)$. □
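
The construction of $g$ and of the multiplication on subsets can be sketched as follows. The dictionaries `h` and `pi`, which give the images of the letters, and the function names are assumptions made for this example. The sketch only covers the power semigroup and the images of the letters under $g$, not the computation of the accepting set.

```python
def subset_product(X, Y, mul):
    """Multiplication in the power semigroup: X * Y = {xy | x in X, y in Y}."""
    return frozenset(mul(x, y) for x in X for y in Y)


def project_generators(h, pi):
    """Images g(b) = h(pi^{-1}(b)) for all letters b in the image of pi."""
    g = {}
    for a, b in pi.items():
        g.setdefault(b, set()).add(h[a])
    return {b: frozenset(xs) for b, xs in g.items()}


# Example: S = Z_2 with addition, A = {a, b, c}, B = {x, y}.
add2 = lambda s, t: (s + t) % 2
h = {"a": 0, "b": 1, "c": 1}
pi = {"a": "x", "b": "x", "c": "y"}
g = project_generators(h, pi)
g_xy = subset_product(g["x"], g["y"], add2)
g_yy = subset_product(g["y"], g["y"], add2)
```

Here $g(x) = \{0, 1\}$ because both preimages $a$ and $b$ of $x$ contribute their images under $h$, and $g(yy) = \{0\}$ because the only preimage of $yy$ is $cc$.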


8 Experimental results 

In order to test the algorithms and constructions in practice, we implemented the 
conversion of MSO formulas into strongly recognizing morphisms. The constructions 
described in Section 7 are used to recursively convert the formulas, and all
intermediate results are minimized using the algorithm from Section 6. For details on MSO
logic over infinite words and its connexion to regular languages, we refer to MM- 
The conversion to strongly recognizing morphisms instead of Biichi automata has 
the advantage that all intermediate objects can be minimized efficiently. Table 1
shows the size of the computed syntactic semigroup $S$, the number of linked pairs
$F$ and the size of the accepting set $P$ (which is closed under conjugation) for the
following three families of MSO formulas with parameter $k \ge 1$ and free second-order
variables $X_1, X_2, \ldots, X_k$:

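$$\varphi_k = \forall x \bigwedge_{i=1}^{k} \exists y\, (x < y \wedge y \in X_i)$$

$$\psi_k = \forall x \forall y\, (y = x + 1) \rightarrow \bigwedge_{i=1}^{k} (x \in X_i \rightarrow y \in X_{i+1})$$

$$\chi_k = \forall x \bigwedge_{i=1}^{k} \big(x \in X_i \rightarrow \exists y\, (x < y \wedge (y \in X_{i-1} \vee y \in X_{i+1}))\big)$$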
            $\varphi_k$                 $\psi_k$                    $\chi_k$
         |S|    |F|    |P|      |S|    |F|    |P|      |S|    |F|    |P|
k = 2      4      5      1       12     15     10        7     14     11
k = 3      8     22      1       43     50     41       11     26     15
k = 4     16     74      1      148    163    146       17     61     30
k = 5     32    232      1      539    570    537       41    227     85
k = 6     64    710      1     1863   1926   1861      105    716    184

Table 1: Experimental results for different parameter values 


All computations were made on an Intel Core i5-3320M with 4 GiB of RAM. The
execution time was less than three seconds for each formula. 


9 Summary and Outlook 

We described several algorithms for weakly recognizing morphisms and strongly
recognizing morphisms over infinite words. Our tests indicate that strongly recognizing
morphisms, when combined with the minimization algorithm presented in Section 6,
are a practical alternative to automata-based models when it comes to deciding 
properties of MSO formulas. 

Some of the algorithms leave room for optimization. In particular, it would be 
interesting to see whether there is a linear-time algorithm to compute conjugacy 
classes and whether the running time of the algorithm described in Section [S] can be 
improved to 0{\A\ ■ |S'^|). 